1. Introduction


The evaluation of biological responses to assess and predict the impact of environmental
changes on ecosystem functioning is receiving increasing attention, and current research
focuses on overcoming concerns about the specificity of biomarkers (Amiard-Triquet et al.,
2012). Generally, biomarkers are chemicals, metabolites, susceptibility characteristics, or
physiological changes that relate to the exposure of an organism to a chemical. Accordingly,
a selected biomarker, i.e. a biological response, can be linked to a specific environmental
exposure and taken as representative of the health status of the ecosystem studied. The
identification of biomarker profiles has become possible with the development of metabolomics.
Such profiles allow the identification of the relevant biological responses associated with a
particular exposure, whereas the assessment of a single biomarker can only estimate the
potential response of the ecosystem to a particular pollutant.


Metabolomics is the generic name assigned to a scientific field that addresses the
characterization of low molecular weight organic metabolites released by living organisms in
response to environmental stimuli. Morrison et al. (2007) provided an extended definition:
"the application of metabolomics to the investigation of both free-living organisms obtained
directly from the natural environment (whether studied in that environment or transferred to a
laboratory for further experimentation) and of organisms reared under laboratory conditions
(whether studied in the laboratory or transferred to the environment for further
experimentation), where any laboratory experiments specifically serve to mimic scenarios
encountered in the natural environment".


The methodological approach of metabolomics relies on a comprehensive analysis of the set
of metabolites, or "metabolome", produced in response to particular environmental stimuli.
Accordingly, the metabolome is the pool of metabolites (small molecules) within a cell, tissue,
organ, biological fluid, or entire organism (Miller, 2007). Exposure of an organism to an
external stressor will result in changes at the level of the metabolome (Ankley et al., 2006; van
Ravenzwaay et al., 2007), and such changes may constitute a highly sensitive indicator of an
external stress. Therefore, metabolomics has potential as a sensitive and rapid technique that
can elucidate the relationships between metabolite levels and an external stressor, such as
contaminant exposure, nutritional deficit or disease.


The main advantage of metabolomics over traditional research is that it overcomes the bias
associated with the assessment of predefined metabolites (Singh, 2006). Among the diverse
applications of omic profiling methods to the environmental sciences, ecotoxigenomics
addresses the response of organisms to pollutants based on the different sensitivity of species
to toxicants (Spurgeon et al., 2008). Currently, the implementation of metabolomics in
ecological risk assessment is still at an early stage, and it is mostly applied as a screening
tool to assess the potential toxic effects of pollutants or to determine the mode of action (MOA)
of a toxicant. In addition, the application of metabolomics to environmental monitoring allows
the study of a large variety of species from relatively uncontrolled environments.


Bundy et al. (2009) highlight the challenge of identifying a large number of metabolites and
the necessity of creating metabolite databases specifically dedicated to environmental issues.
Multivariate statistical analysis has proved highly effective for metabolite identification.
Thus, principal component analysis (PCA) is used to identify differences between the metabolic
profiles of organisms exposed to organic or inorganic pollutants (Jones et al., 2014; Kwon
et al., 2012; Lankadurai et al., 2011). In addition, the association between the metabolic
profile and biological factors evaluated as markers of exposure to pollutants can be modelled
by partial least squares (PLS) regression analysis (Ellis et al., 2012).


The implementation of metabolomics for the assessment of soil contamination is nevertheless
at an early stage (Viant, 2009). A basic screening of published research in the Web of
Science returns circa 100 items for the search "soil pollution-metabolomics", with a
marked increase in 2011 (Figure 1); this is reduced to 21 records, published in the period
2007-2013, when the search is narrowed with the term “biomarkers”. However, emerging
regulatory challenges demand advances in toxicity testing. Toxicogenomics tools have
been presented as an advance over the current methodologies used for regulatory decision
making in ecotoxicology, which rely entirely on whole animal exposures and adverse effects
on survival, growth, and reproduction (Ankley et al., 2006). From the acquisition of
reproducible metabolic profiles in response to the presence of specific pollutants in soil
(Jones et al., 2008) to the application of metabolomics techniques to the study of the response
of the entire community of a soil to factors such as pollution and climate change (Jones et
al., 2014), the implementation of metabolomics in ecotoxicology is a sound answer to the
current needs of society and the environment (van Ravenzwaay et al., 2012).


During the last decade a number of general reviews of the application of metabolomics
in environmental health assessment have been published (Bundy et al., 2009; Miller,
2007; Snape et al., 2004; van Ravenzwaay et al., 2007; Viant et al., 2003). The present review
specifically summarizes the most significant research concerning the implementation of
metabolomics in soil contamination assessment. The main objectives of this review are i)
to provide a systematized outline for the application of metabolomics in risk assessment of
soil contamination, ii) to provide a rapid guide to the methodological approaches currently
optimized and iii) to unify and simplify the knowledge currently available on the topic
to provide an accessible tool for further advances in the implementation of metabolomics in
risk assessment.


Figure 1. Citation report (published items in each year, citations in each year) for the search
"soil pollution-metabolomics" as obtained from Thomson Reuters (January 2014).


2. Methodological approaches 


2.1. Metabolites isolation 


Generally, metabolites are extracted by moderate chemical extraction from intact organisms
(occasionally from selected relevant tissues) that have been exposed to the studied toxicant
(Baylay et al., 2012; Yuk et al., 2010). The organisms commonly selected for toxicity testing are
earthworms (Table 1), particularly the genus Eisenia, which is a classic model organism for
toxicity assays (Sanchez-Hernandez, 2006; Van Gestel et al., 1992; van Gestel et al., 1989) and
has long been included in official guidelines (OECD, 1984, 2004). Earthworms ingest
large amounts of soil and take up a significant amount of contaminant through the skin.
Therefore they are continuously exposed to contaminants. Extractions performed with
methanol-chloroform (Baylay et al., 2012) or phosphate buffer solution (Yuk et al., 2010) on
pulverized or lyophilised organisms are described as extracting the maximum number of
metabolites while allowing reliable analyses to be performed.


The isolated extracts usually do not require further sample treatment prior to analysis,
which both minimizes the introduction of artefacts and facilitates the development of low cost,
rapid methodologies.


2.2. Metabolites determination: chromatography, spectroscopy and spectrometry


The leading analytical techniques in metabolomics for soil contamination assessment are
proton nuclear magnetic resonance spectroscopy (1H NMR) and gas chromatography-mass
spectrometry (GC-MS), as reported in Table 1. Both allow the identification of compounds at
the molecular level in the analysed substances. Several authors have also implemented high
pressure liquid chromatography (HPLC) and ultra-high pressure liquid chromatography (UPLC)
coupled with a mass spectrometry (MS) detector for the assessment of biological responses to
soil contamination with heavy metals (Hédiji et al., 2010; Hughes et al., 2009). Overall, these
analytical techniques allow the determination and identification of the metabolites that best
represent the metabolic alterations related to the toxic effects of organic or inorganic
contaminants in soil.


Technique    Organic toxicant    Animal model tested    Biomarkers of contaminant exposure    Reference
P[2-hexyl-5-ethyl-3- 
1H NMR " y 
"— 2-fluro-4-methylaniline E. veneta furansulfonate, maltose], (Bundy et al., 2002) 
inosine monophosphate 
1H NMR 
3-trifluromethyaniline E. veneta lactate (Lenz et al., 2005) 
GC/MS 
1H NMR : n Ala, Gly, Asn, glucose, 
3-trifluromethylaniline E. veneta : : (Warne et al., 2000) 
GC/MS citrate, succinate] 
P[acetate, malonate], 
1H NMR 
CONE 3-fluro-4-nitrophenol E. veneta succinate, trimethylamine- (Bundy et al., 2001) 
N-oxide] 
»2-hexyl-5-ethyl-3- 
1H NMR y ý 
Stm 3,5-difluroaniline E. veneta furansulfonate, 'inosine (Bundy et al., 2002) 
monophosphate 
1H NMR 7 P[Maltose, hexyl-5-ethyl-3- 
4-fluroaniline E. veneta (Bundy et al., 2002) 
GC/MS furansulfonate] 
1H NMR Aroclor 1254 E. fetida No significant changes (Fitzpatrick et al., 1992) 
1H NMR Caffeine E. fetida fumarate (McKelvie et al., 2011) 
1H NMR Carbamazephine E. fetida P[Fumarate, Glu, Val, °Leu] McKelvie et al. (2011) 
(Heimbach, 1988) 
1H NMR Carbaryl E. fetida P[Phe, Tyr, Lys, Ala, Val, Leu] (Edwards and Bater, 
1992) 
1H NMR Chlorpyrifos E. fetida No significant changes (Yu et al., 2006) 
1H NMR Chlorpyrifos L. rubellus ‘fumarate (Baylay et al., 2012) 


1H NMR Ala, betaine, ornithine], 
spectroscopy, Chlorpyrifos C. elegans P[choline, Glu, Gly, Ile, (Baylay et al., 2012) 
GC-MS lactate, n-buterate, taurine] 
. . P[Phenylalanine, Alanine, (Drewes and Vining, 
1H NMR Dimethyl phthalate E. fetida . ; 
Leucine, Valine] 1984) 
1H NMR 
DTT and Endosulfan E. fetida ‘maltose, Ala, Leu] (McKelvie et al., 2009) 
GC/MS 
Significant fluctuations in 
1-D & 2-D 'H 
: glutamine/GABAeglutamate 
NMR Endosulfan E. fetida . (Yuk et al., 2013) 
cycle metabolites and 
Spectroscopy ao 
spermidine 
Significant fluctuations in 
1-D & 2-D 'H 
: glutamine/GABAeglutamate 
NMR Endosulfan Sulfate E. fetida . (Yuk et al., 2013) 
cycle metabolites and 
spectroscopy n 
spermidine 
1H NMR Estrone E. fetida PIAdenine, Glu] McKelvie et al. (2011) 
. : PToxicological endpoints 
1H NMR Imidacloprid/ . . 
. . . L. rubellus (survival, weight loss, and Baylay et al. (2012) 
GC-MS analysis Thiacloprid : 
reproduction) 
1H NMR Naphthalene E. fetida P[Ala, Leu, Val, Lys] (Brown et al., 2009) 
1H NMR Nonylphenol E. fetida P [adenine, Glu] McKelvie et al 2011 
Polybrominated 
1H NMR diphenyl ethers (PBDE) E. fetida PMaltose, '[Lys, Glu] McKelvie et al. (2011) 
209 
‘succinate, HEFS, Glu], P[leu, 
1H NMR Perfluorooctanoic acid £. fetida Val, LyS, Phe, Arg, maltose, (Lankadurai et al., 2012) 
ATP] 
‘succinate, HEFS, Glu], P[leu, 
Perfluorooctane . J 
1H NMR E. fetida Val, Lys, Phe, Arg, maltose, | (Lankadurai et al., 2012) 
sulfonate 
ATP] 
Leu, Val, Ala, Lys and maltose 
1H NMR PHA E. fetida changed in response to PAH (Brown et al., 2009) 
exposure. 
Ala, betaine, Scyllo- and 
. CU (Lankadurai et al., 2011) 
1H NMR Phenanthrene E. fetida myo-inositol, cholesterol, 


f . Brown et al. (2010) 
phosphatidylcholine], PGlu 
: McKelvie et al. (2011) 
Polychlorinated . mE 
1H NMR . E. fetida IATP (Whitfield Aslund et al., 
biphenyl 
2011) 
Poly brominated 
1H NMR : E. fetida Ilys, Glu], Pmaltose McKelvie et al. (2011) 
diphenyl ethers 
lactate, tetradecanoic acid, 
1H NMR hexadecanoic acid, 
Pyrene L. rubellus . . (Jones et al., 2008) 
GC/MS octadecanoic acid], '[Ala, Leu, 
Val, lle, Lys, Tyr, methionine] 
1-D & 2-D 'H 
NMR Rifluralin E. fetida Ala, Gly, maltose, ATP] Yuk et al. (2011) 
spectroscopy 
. . No significant changes were 
1H NMR Thiacloprid L. rubellus Baylay et al. (2012) 
reported 
1-D & 2-D 'H 
NMR Trifluralin E. fetida Ala, Gly, ATP], Pmaltose Yuk et al. (2011) 
spectroscopy 
1H NMR 
Iphytochelatins, 
spectroscopy andCd C. elegans oe (Hughes et al., 2009) 
cystathionine 
UPLC-MS 
‘choline, P[glucose, citrate, 
1H NMR, HPLC- . malate, glutamine, mE 
S. lycopersicum : (Hédiji et al., 2010) 
PDA asparagine, Phe, Tyr, Val, lle, 
trigonelline] 
Phe, glucose, malate, 
arachidonic acid, fumarate, 
Lys, Tyr, monosaccharides, 
1H NMR, GC-MS : . monophosphorylated form 
: Chlorpyrifos + Ni L. rubellus A : . (Baylay et al., 2012) 
analysis of inositol, succinate, uracil, 
ethanolamine, Pro, 
putrescine], 
Pmyo-inositol, 
1H NMR Ala, creatine, His, lactate, 
spectroscopy, Chlorpyrifos + Ni C. elegans betaine, carnosine], (Jones et al., 2012) 
GC-MS P[choline, Gly, Ile, leu, lys, Val] 
1H NMR Cu (Il) L. rubellus His (Gibb et al., 1997) 


'H NMR 

Asn, choline, Gln, lactate, 
spectroscopy, Ni C. elegans . Jones et al. (2012) 
are succinate], "[Ala, Ile, Gly, Val] 


i) Exposure to Pb - "Tyr. 
ii) Exposure to Zn - '[Val, lle, 
Leu, Thr, Ala, Asn, Phe, 
fosfocoline], P[acetate, Gly, 
glucose, fumarate and 
1H NMR Pb, Zn S. salsa ferulate]. (Wu et al., 2013) 
iii) Exposure to Pb & Zn - 
Ala, Asn, Tyr, Phe], P[acetate, 
succinate, Asp, malonate, 
fructose, glucose, fumarate 


and ferulate] 


Mixed metals 
1H NMR (Cadmium, copper, lead L. rubellus ! [maltose, His] (Bundy et al., 2004) 


and zinc) 


I[Thr, Leu, Tyr, betaine, Ser, 
. . Lys, lle, Ala, Arg, Val, . 
1H NMR Tellurite P. pseudoalcaligenes . . (Tremaroli et al., 2009) 
glutathione, adenosine], 


P[Lactate, Gly] 


! [Phe, Tyr, Lys, Glu, Ala, (Whitfield Aslund et al., 


lactate, Val, Leu], Pmaltose 2012) 


1H NMR Titanium dioxide E. fetida 


D = decrease; | = increase 


Table 1 Metabolic responses of test organisms following exposure to selected environmental contaminants in contact
tests.


2.3. Metabolomics data analysis 


The general approach to data analysis in metabolomics can be summarized in three main
stages: explorative, supervised and biological interpretation (Smilde et al., 2010). The
explorative phase aims to find groups, clusters and outliers among the metabolites and samples
studied, while the supervised phase discriminates between two or more groups to build
predictive models and to find biomarkers (Amiard-Triquet et al., 2012; Dallinger-Marianne, 2000;
van Ravenzwaay et al., 2007). Multivariate methods are currently preferred, although univariate
and semi-univariate methods have been commonly used for selecting biomarkers. For instance, the
lysosomal system was identified as a particular target for the toxic effects of pollutants in
soil organisms. However, it is nonspecific as a marker, and only when included in a suite of
biomarkers across diverse soil invertebrate species can it provide the necessary specificity for
risk assessment purposes (Kammenga et al., 2000). Finally, the biological interpretation seeks
the links between metabolome data and underlying metabolic networks through metabolite set
enrichment, pathway analysis and metabolic network inference (Trygg et al., 2006). Thus, finding
metabolite relationships is essential to determine comprehensive and meaningful metabolic
changes as a biological response to environmental stimuli (Ellis et al., 2012; Morrison et al.,
2007). Accordingly, such extensive evaluation of the impact of pollutants on the metabolism of
target organisms is the approach that can add value to the assessment of soil health and the
viability of soil organisms undergoing stress from pollution.
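

The metabolite set enrichment step mentioned above can be illustrated with a simple
over-representation test. The sketch below is only one possible formulation, assuming a
hypergeometric model; the function name and all counts are hypothetical and do not come from
the studies cited in this review.

```python
from math import comb


def enrichment_p_value(n_background, n_in_pathway, n_changed, n_overlap):
    """Probability of seeing at least n_overlap pathway members among the
    changed metabolites if they were drawn at random from the background
    (upper tail of the hypergeometric distribution)."""
    total = comb(n_background, n_changed)
    upper = min(n_in_pathway, n_changed)
    # Sum exact integer counts first, then divide once.
    favourable = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / total


# Hypothetical example: 100 metabolites measured, 11 of them belong to the
# pathway of interest, 10 metabolites changed on exposure, 4 of those are
# pathway members.
p = enrichment_p_value(100, 11, 10, 4)
print(f"over-representation p-value: {p:.4f}")
```

A small p-value suggests that the pathway is over-represented among the responsive
metabolites; in practice a correction for testing many pathways would also be needed.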


3. Metabolomics bioinformatics 


Information processing by bioinformatics tools and computational biology methods has
become essential for solving complex biological problems in genomics, proteomics, and
metabolomics. Understanding "omics" data requires both common statistical and computational
methods because of the multidimensionality and complexity of the data.


Data-analytical methods for the study of biological systems, as developed in the field of
computational biology, provide a suite of indispensable tools to survey the outcome of
metabolomics studies. First, computational biology allows a fast screening of the large
biological and chemical data sets generated (Shulaev, 2006), and therefore the identification of
the most relevant metabolites, i.e. compounds specifically representative of the metabolic
changes in the model system following exposure to different concentrations of organic and
inorganic toxicants. As a result of the large number of variables (metabolites) studied,
metabolomics studies have significant statistical power for the systematic detection of
biological responses to environmental changes (van Ravenzwaay et al., 2012). Second, the
mathematical models developed in computational biology allow the identification of relationships
between the external stimuli and the metabolic response (Zhang et al., 2010). Third, the
application of computational algorithms to structural biology makes it possible to discover the
structure-function relationships of new macromolecular compounds, their functional enzymatic
conversion and changes in their activity, as well as their molecular interactions and
relationships with other compounds in the pathways in which they are involved (Jimenez-Lopez et
al., 2013). Moreover, it is possible to detect patterns in such biological responses and
establish significant dose-response relationships. In addition, pattern recognition reduces the
metabolomics data from hundreds of variables to two or three components that are orthogonal to
each other. Overall, this advance of computational biology has been possible due to three
significant technological breakthroughs: high-information-content data streams, novel
bio-statistical methods, and the computational power to analyse these data.


Data processing and statistical analyses are commonly performed using multivariate (typically
principal component analysis (PCA) and/or partial least squares (PLS) regression analysis)
and univariate (t-test) analyses (Brown et al., 2010; Jones et al., 2014; McKelvie et al., 2011;
Yuk et al., 2013). These analyses are performed in combination with the quantification and
identification of the metabolites. Subsequently, biological interpretation of the data is
necessary for understanding the link between the external stimulus and the metabolic response of
the organisms.


Principal component analysis is the most widely used multivariate statistical approach in
metabolomics, used to explain the overall variability in a data set via a set of uncorrelated
variables called principal components (PCs), which are linear combinations of the original
variables (Trygg et al., 2006). The organization of samples in PCA scores plots is based on the
similarities between their metabolic profiles. Thus, PCA allows for dimensional reduction of
the data into a low dimensional plane, such as PC1 versus PC2. The scores plot (e.g., PC1 versus
PC2) allows for a visual examination of the relationship between the samples based on their
metabolic profiles. In a 1-D PCA loadings plot, the contribution (or weight) of each metabolite
to the discrimination of the sample classes along one component is represented by the intensity
of the metabolite peak. In the 2-D PCA loadings plot, discriminating metabolites are identified
by selecting the points that are scattered furthest from the tight cluster of points found near
the origin.
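

The relation between scores and loadings described above can be made concrete with a small
numerical sketch. It assumes a samples-by-metabolites table of made-up intensities (three
controls followed by three exposed organisms; the metabolite names are illustrative) and
extracts only the first principal component by power iteration on the covariance matrix. It is
a minimal illustration, not the procedure used in the cited studies, which rely on dedicated
statistical software.

```python
import math


def pca_first_component(rows, n_iter=200):
    """Return (loadings, scores) of PC1 for a samples-by-variables table."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    # Mean-centre every variable (metabolite).
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between metabolites.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, i.e. the PC1 loadings.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores: projection of each centred sample onto the loadings.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical intensities for (alanine, leucine, maltose).
intensities = [
    [1.0, 0.9, 2.0], [1.1, 1.0, 2.1], [0.9, 1.1, 1.9],   # controls
    [1.8, 1.7, 1.2], [2.0, 1.9, 1.0], [1.9, 1.8, 1.1],   # exposed
]
loadings, scores = pca_first_component(intensities)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores:  ", [round(x, 3) for x in scores])
```

In this toy data set the controls and the exposed organisms fall on opposite sides of PC1, and
the loadings show which metabolites drive that separation. The sign of a principal component is
arbitrary, so only relative positions are meaningful.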


Other widely used multivariate statistical tools in metabolomics are PLS regression analysis
and PLS discriminant analysis (PLS-DA). Both PLS-regression and PLS-DA are supervised methods
in which predefined variables are added to maximize the separation between the sample classes
and to construct predictive models. The predefined variables for PLS-regression are measurable
quantities such as the contaminant exposure concentration, whereas PLS-DA uses class membership
(e.g. control versus exposed). Validation methods such as leave-one-out cross validation are
used to test the robustness of the models generated by PLS-regression, PLS-DA, orthogonal PLS
(OPLS), and OPLS-DA (Whitfield Aslund et al., 2011).
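

As an illustration of how leave-one-out cross validation tests such a model, the sketch below
fits a one-component PLS regression (PLS1) of exposure concentration on metabolite intensities
and reports the cross-validated Q2 statistic (1 - PRESS / total sum of squares). The data, the
function names and the restriction to a single component are assumptions made for the example;
published studies use full multivariate packages.

```python
import math


def pls1_fit(X, y):
    """Fit a one-component PLS1 model; returns (x_means, y_mean, w, b)."""
    n, p = len(X), len(X[0])
    x_means = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_means[j] for j in range(p)] for r in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Latent scores and regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, w, b


def pls1_predict(model, x):
    x_means, y_mean, w, b = model
    t = sum((x[j] - x_means[j]) * w[j] for j in range(len(x)))
    return y_mean + b * t


def leave_one_out_q2(X, y):
    """Q2 = 1 - PRESS / SS_total, each sample predicted by a model
    trained on all the other samples."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    y_mean = sum(y) / len(y)
    ss_total = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_total


# Hypothetical data: intensities of (alanine, maltose) and exposure in mg/kg.
X = [[1.0, 2.0], [1.1, 2.1], [1.5, 1.6], [1.4, 1.5], [2.0, 1.0], [2.1, 1.1]]
dose = [0.0, 0.0, 5.0, 5.0, 10.0, 10.0]
print("Q2 =", round(leave_one_out_q2(X, dose), 3))
```

A Q2 close to 1 indicates that the model predicts left-out samples well; a value far below the
fitted R2 would point to overfitting.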


Although metabolomics studies mostly use multivariate statistics, univariate statistical
analyses can contribute to the information gained from a study. Thus, t-tests can be used to
assess the significance of the separation between the controls and stressed organisms in PCA
and PLS-DA scores plots. Also, t-tests can be used to determine which metabolites in the 1H
NMR spectra of the treatment class increased or decreased significantly relative to the controls.
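

A minimal sketch of such a univariate comparison is given below for a single metabolite. It
computes the Welch (unequal variance) t statistic and its approximate degrees of freedom; the
choice of the Welch variant and the intensity values are assumptions for illustration, and the
p-value itself would be read from a t distribution using a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(control, exposed):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom for the
    difference exposed - control."""
    n1, n2 = len(control), len(exposed)
    v1, v2 = variance(control), variance(exposed)   # sample variances
    se2 = v1 / n1 + v2 / n2                          # squared standard error
    t = (mean(exposed) - mean(control)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical normalized peak intensities of one metabolite.
control = [1.0, 1.1, 0.9, 1.0]
exposed = [1.8, 2.0, 1.9, 1.7]
t, df = welch_t(control, exposed)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A positive t indicates an increase relative to the controls and a negative t a decrease. When
many metabolites are tested in this way, a correction for multiple comparisons is advisable.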


4. Biomarkers 


The somewhat secondary significance of biological responses for soil contamination assessment
was customarily associated with the limitation of biomarkers as measurable responses to
contaminants, which classically could only provide an indication of exposure to contaminants
in soil (Sanchez-Hernandez, 2006). The development of metabolomics, considered an "emerging
field" as late as mid-2010, has provided the tools for the determination of multiple
biomarkers across different levels of biological organization, and therefore a better assessment
of the ecological consequences of contamination. Since the creation of the first metabolomics
web database, METLIN (Smith et al., 2005), 60,000 metabolites have been incorporated, a rapid
development closely related to the evolution of mass spectrometry instrumentation and data
analysis tools. Currently, the number of databases and metabolites registered is continuously
increasing. Table 2 lists some of the most relevant databases currently in operation, together
with their websites. Further information on metabolomics databases can
be obtained from the Metabolomics Society (http://www.metabolomicssociety.org). For
instance, ChemSpider is an aggregated database of organic molecules containing more than
20 million compounds from many different providers. At present the database contains
information from such diverse sources as a marine natural products database, ACD-Labs
chemical databases, the EPA's DSSTox databases and a series of chemical vendors. It has
extensive search utilities, and most compounds have a large number of calculated
physicochemical property values.


One of the goals in bioinformatics is to establish automated and efficient ways to integrate
large biological datasets from multiple sources. This objective is challenging because data
sources are heterogeneous in terms of their functions, structures, data access methods and
dissemination formats. In addition, the enormous quantity of information produced by
“omics” is handled via computers that systematically analyze and store the accumulating
sequence, structure and function data. Databases are essential in metabolomics because they
provide a rapid and specific tool to identify the compounds isolated from an organism exposed
to a particular environmental challenge. Thus, the KNApSAcK package provides tools for
analysing datasets of mass spectra as well as for retrieving information on metabolites by
entering the name of a metabolite, the name of an organism, the molecular weight or the
molecular formula. A list of metabolites that are associated with a taxonomic class can be
obtained by searching with the taxonomic name, from which information on individual metabolites
can be retrieved. The NIST Chemistry WebBook provides access to chemical and physical property
data for chemical species. The data provided on the site are from collections maintained by the
NIST Standard Reference Data Program and outside contributors. Data in the NIST Chemistry
WebBook can be found by direct searches for chemical species or indirect searches based on
related data. Specific databases are also being developed, such as LIPID MAPS, currently the
largest database of lipid molecular structures. Finally, SetupX combines mass spectrometric
and biological metadata, which is a step forward in the organization of the information
generated by metabolomics analysis.
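

The kind of query described above (retrieval by molecular weight or molecular formula) can be
sketched with a small local table. The example below is a stand-in for a database search and
does not use the interface of KNApSAcK or of any other database listed in Table 2; the table
contents, the function names and the 10 ppm tolerance are assumptions chosen for illustration.

```python
# Illustrative reference table: (name, molecular formula, monoisotopic mass in Da).
METABOLITES = [
    ("glycine", "C2H5NO2", 75.03203),
    ("alanine", "C3H7NO2", 89.04768),
    ("lactic acid", "C3H6O3", 90.03169),
    ("fumaric acid", "C4H4O4", 116.01096),
    ("valine", "C5H11NO2", 117.07898),
    ("succinic acid", "C4H6O4", 118.02661),
    ("leucine", "C6H13NO2", 131.09463),
    ("isoleucine", "C6H13NO2", 131.09463),
]


def search_by_mass(mass, tol_ppm=10.0):
    """Return (name, formula, error in ppm) for entries within tol_ppm."""
    hits = []
    for name, formula, ref in METABOLITES:
        error_ppm = abs(mass - ref) / ref * 1e6
        if error_ppm <= tol_ppm:
            hits.append((name, formula, round(error_ppm, 1)))
    return sorted(hits, key=lambda h: h[2])


def search_by_formula(formula):
    """Return the names of all entries with the given molecular formula."""
    return [name for name, f, _ in METABOLITES if f == formula]


print(search_by_mass(89.0480))         # a single candidate
print(search_by_mass(131.0950))        # two isomers share this mass
print(search_by_formula("C4H6O4"))
```

The second query returns both leucine and isoleucine: isomers cannot be distinguished by mass
alone, which is one reason why spectral libraries and complementary techniques such as NMR are
used alongside mass-based searches.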


METLIN        http://metlin.scripps.edu/index.php
LIPID MAPS    http://www.lipidmaps.org/
KEGG          http://www.genome.jp/kegg/pathway.html
ChemSpider    http://www.chemspider.com/
SetupX        http://fiehnlab.ucdavis.edu/projects/binbase_setupx
KNApSAcK      http://kanaya.naist.jp/KNApSAcK/
NIST          http://webbook.nist.gov/chemistry/
MassBank      http://www.massbank.jp/
HMP           http://www.hmdb.ca/
IIMDB         http://metabolomics.pharm.uconn.edu/iimdb/


Table 2 Selected metabolomic databases. 
Metabolomic databases are thus accompanied by an accurate description of the biological study
design and by metadata reporting on the laboratory workflow from sample
preparation to data processing.


Currently, standard analyses focus on the determination of amino acids, mono- and
disaccharides, lipids/fatty acids, short chain fatty acids and small phenolics. Accordingly, it
is already possible to begin standardizing metabolomics analysis. For instance, the
Northwest Metabolomics Research Center (University of Washington) has established a
relevant list of target compounds to evaluate biological responses to changes in the
environment. The list of compounds is summarized in Table 3.


Metabolic Pathways Number of Metabolites 
Alanine, aspartate and glutamate metabolism 15 
Arginine and proline metabolism 23 
Butanoate metabolism 18 
Citrate cycle (TCA cycle) 11 
Cysteine and methionine metabolism 14 
Fatty acid metabolism 3 
Glutathione metabolism 14 
Glycine, serine and threonine metabolism 21 
Glycolysis / Gluconeogenesis 16 
Histidine metabolism 13 
Lysine biosynthesis 7 
Lysine degradation 6 
Nitrogen metabolism 9 
Oxidative phosphorylation 6 
Pentose phosphate pathway 10 
Phenylalanine metabolism 10 
Phenylalanine, tyrosine and tryptophan biosynthesis 8 
Purine metabolism 30 
Pyrimidine metabolism 30 
Pyruvate metabolism 10 
Synthesis and degradation of ketone bodies 4 
Tryptophan metabolism 15 
Tyrosine metabolism 18 
Valine, leucine and isoleucine biosynthesis 11 
Valine, leucine and isoleucine degradation 5 


Table 3 Summary of metabolites and metabolic pathways representative of biological responses to environmental 
stimuli. 


The information on metabolites and metabolic pathways has been obtained from the website
of the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/). According
to the research results summarized in Table 1, the implementation of metabolomics in the
assessment of soil contamination indicates that contaminants in soil affect several of the major
metabolic pathways in living organisms (Table 3), including glycolysis, the tricarboxylic acid
cycle and amino acid metabolism. Moreover, data analysis indicates an overall reduction in
the production of the associated metabolites. For instance, the interference in specialized
amino acid pathways results in a decreased synthesis of purine and pyrimidine nucleotides
(Brown et al., 2010; McKelvie et al., 2011). These nucleotides are essential for the production
of the energy (ATP molecules) that drives most of the enzymatic reactions in living organisms.
Protein synthesis is consequently hampered as well, which explains the negative effect on
processes such as antioxidant activity.


Another emerging group of biomarkers, as highlighted in several studies, is lipids (Rochfort
et al., 2009; Sanchez-Hernandez, 2006). Rochfort et al. (2009) indicate that lipophilic extracts
can be used in field-based metabolomics experiments to investigate different treatment effects
on earthworms. Lipid metabolism is highly sensitive to environmental contaminants (Vega-López
et al., 2013), with increasing production of lipoprotein vesicles and an increasing lipid
peroxidation rate during the early stages of the biological response to the presence of a
toxicant (Lankadurai et al., 2011). Relatedly, earthworm esterases have been proposed as
biomarkers for pesticide contamination in soil (Sanchez-Hernandez, 2010). Esterases are directly
involved in the natural tolerance of earthworms to pesticides and can therefore be used as
specific biomarkers. Furthermore, their characterization by a metabolomics approach might help
to select the appropriate earthworm species for regulatory toxicity testing. Overall, the
increasing specificity of the research performed in ecotoxigenomics will allow a realistic and
meaningful incorporation of biological responses in ecological risk assessment.


5. Oxidative stress in contaminated soil 


The induction of the oxidative stress response by the presence of toxic compounds in the
environment is a primary mechanism of defence, although prolonged exposure to contaminants
is likely to overwhelm this short-term defence (Regoli et al., 2002).


Metabolites such as proline possibly detoxify reactive oxygen species (ROS) under stress in
vivo (Smirnoff, 1993). Exposure of plants to both redox-active metals (e.g. Cu and Hg) and other
metals (e.g. Cd and Zn) induces the generation of free radicals, which leads to oxidative
stress. This represents one of the major causes of toxicity, particularly for redox-active
metals. Cells are equipped with an elaborate network of antioxidative enzymes and low molecular
weight metabolites which mitigate the oxidative stress. Proline scavenges different free
radicals in certain in vitro generation and detection systems.


Proline quenches ROS and reactive nitrogen species (RNS), which relieves the oxidative
burden on the glutathione system. Moreover, polyamines also have an antioxidative role by
quenching the accumulation of O2•−, probably through inhibition of NADPH oxidase (Paschalidis
and Roubelakis-Angelakis, 2005). This may facilitate phytochelatin synthesis and enhance
metal tolerance (Siripornadulsil et al., 2002).


Overall, the oxidative defence response to toxicity or other environmental stress involves the
generation of oxygenated metabolites by exposed organisms and the activation or inhibition of
the production of antioxidant enzymes and metabolites such as glutathione (GSH). The depletion
of antioxidants during prolonged exposure might decrease the effectiveness of the response
and eventually cause an imbalance between the generation and elimination of reactive oxygen
species. Depletion of glutathione appears to be a major mechanism in short-term heavy metal
toxicity (Schutzendubel and Polle, 2002). In accordance with this hypothesis, a good correlation
between glutathione content and tolerance index was observed among 10 pea genotypes
differing in Cd sensitivity (Metwally et al., 2005). High GSH concentrations in the
hyperaccumulator T. goesingense coincided with high constitutive activity of serine acetyl
transferase (SAT); SAT catalyses the acetylation of L-Ser to O-acetylserine (OAS), which in turn
provides the carbon skeleton for Cys biosynthesis. Elevated GSH levels in T. goesingense also
coincided with the ability both to hyperaccumulate Ni and to resist its damaging oxidative
effects.


The significance of glutathione and the metal-induced phytochelatins (PCs) in heavy metal
tolerance has been studied intensely (Rauser, 1995). However, PCs are important for
detoxification of only a limited set of metals such as Cd²⁺, Cu²⁺ and AsO₄³⁻, while Zn²⁺ and
Ni²⁺ are poor inducers of PCs and exhibit low binding affinity. Most other metals lack
significant binding.


Metabolites related to the oxidative response constitute a relevant group of target
compounds for risk assessment. Although the oxidative response to soil contamination has
classically been addressed in plants, the study of this response in soil microorganisms is already
being introduced in ecotoxicology as a fundamental part of the biological response of soil
microorganisms to soil contamination (Boer et al., 2013; Tremaroli et al., 2009). Accordingly,
Boer et al. (2013) describe the attenuation of the oxidative response for springtails in laboratory
tests, which constitutes an early indicator of soil pollution, and standardized tests have been
developed.


6. Metabolites related to soil contamination with organic compounds 


The importance of identifying biomarkers and metabolic pathways specifically related
to soil contamination with a particular pollutant or group of pollutants has already been
highlighted throughout this chapter. From the information summarized in Table 1 and Table 3 it
is possible to infer that soil contamination with organic compounds, namely pesticides or
polycyclic aromatic hydrocarbons, abates essential metabolic pathways such as the
tricarboxylic acid cycle and the oxidative stress response, while lipid metabolism appears to be
enhanced. However, advances in the application of bioinformatics are providing further
progress in the identification of specific biomarkers for risk assessment of individual
target compounds. Thus, the toxicity of endosulfan has been directly related to alterations of
the GABA-glutamine cycle (Yuk et al., 2013), while chlorpyrifos depresses the Cori cycle and
reduces the production of phospholipids, as indicated by lower levels of choline (Jones et al.,
2012). Baylay et al. (2012) specifically relate chlorpyrifos toxicity to increased levels of
fumarate, an intermediate of the tricarboxylic acid cycle. Research conducted with the same
earthworm (E. fetida) and other families of organic compounds revealed a different metabolic
response (Brown et al., 2010; Lankadurai et al., 2012), confirming the capability of
metabolomics to discriminate the metabolic pathways involved in the response to a particular
toxic compound. Moreover, the results strongly suggest that sets of biomarkers might soon be
sufficiently reliable for their implementation in standardized toxicity tests.


The relevance of these and future studies for the development of risk assessment strategies is
heightened by the inherent risk that soil contamination poses to human health. Soil
contaminants may be responsible for health effects costing millions of euros. Health problems
range from cancer (arsenic, asbestos, dioxins), to neurological damage and lower IQ (lead,
arsenic), kidney disease (lead, mercury, cadmium), and skeletal and bone diseases (lead,
fluoride, cadmium).


Overall, few studies have been conducted on the toxicity of complex chemical mixtures in soils.
The effects of the soil and the organisms within it upon organic pollutants are unknown. The data
currently available correspond mostly to short-term studies and high-level exposure to these
chemicals, which are less relevant to the potential low-level, long-term health impacts on living
organisms near contaminated soil.


7. Metabolites related to soil contamination with heavy metals 


The uptake of excess metal ions is toxic to most organisms, and the biochemical impact of metal
ions on cells varies with the chemical nature of the element. In plants, the phytotoxicity of
heavy metals can largely be attributed to their accumulation in symplastic compartments such
as the cytosol and chloroplast stroma. Metal-induced changes in development are the result of
either a direct and immediate impairment of metabolism or signaling processes that initiate
adaptive or toxicity responses, which need to be considered active processes of the organism.
Transport processes have been recognized as a central mechanism of metal detoxification and
tolerance (Hall, 2002; Hall and Williams, 2003).


Some metals, for example, Zn and Cu, are essential for normal plant growth and development 
as they serve as structural and functional components of specific proteins. Other metals, for 
example, Cd and Pb, have no known function in plants although a Cd requirement for carbonic 
anhydrase from marine diatoms has been reported (Lane and Morel, 2000). 


Upon exposure to metals, organisms often synthesize a diverse set of metabolites that
accumulate to concentrations in the millimolar range. These include specific amino acids such
as proline and histidine, peptides such as glutathione and phytochelatins (PCs), and the amines
spermine, spermidine, putrescine, nicotianamine, and mugineic acids, all of which can be
detected as a response to metal exposure. Toxicogenomics research on organic contaminants is
significantly ahead of the equivalent research on metal-contaminated soil (Table 1).
Nevertheless, research conducted to date has yielded a number of biomarkers
representative of the biological response of soil microorganisms to metal toxicity. Thus, soil
contamination with Pb has been related to an enhancement of lipid metabolism
(Sanchez-Hernandez, 2006) and more directly to a reduction of tyrosine levels (Wu et al., 2013).
In turn, Cd toxicity promotes the secretion of phytochelatins in C. elegans, likely at the
expense of sulphur metabolism, as suggested by the reduction in cystathionine (Hughes
et al., 2009), while the response of tomato plants to Cd involves several biochemical pathways
(Hédiji et al., 2010). These examples illustrate the genuine specificity of biological reactions to
different metals but also the variation in representative biomarkers among different
organisms. Accordingly, exposure of C. elegans to Ni (Jones et al., 2012) yields a different
metabolome than exposure to Cd, since different biochemical pathways are affected.


In plants, data currently available demonstrate the significance of nitrogen-containing 
metabolites beyond phytochelatins and glutathione in plant response and acclimation to heavy 
metals. The various metal ions have specific chemical properties and induce distinct responses 
of adaptation and damage development. Thus, accumulating N-metabolites display a variety
of functions, i.e. metal ion chelation, antioxidant defence, protection of macromolecules, and
possibly signalling.


Proline is an extensively studied molecule in the context of plant responses to abiotic stresses.
Up-regulation of proline is often encountered in plants under heavy metal stress, comparable
to what occurs under other abiotic stresses. When compared at equal toxic strength, proline
accumulation decreased in the order Cd » Zn » Cu (Schat et al., 1997). In addition, different
functions have been suggested for proline under metal stress, including osmoregulation,
metal chelation, antioxidant defence, and regulation of specific functions in plant morphogenesis.


Furthermore, Ni-hyperaccumulation has been specifically linked to histidine production
(Kramer, 2005), particularly for Saccharomyces cerevisiae (Pearce and Sherman, 1999). The
beneficial role of high histidine levels has been shown in transgenic Arabidopsis thaliana, which
accumulated about 2-fold higher histidine levels than wild-type plants and showed more than
10-fold higher biomass production in the presence of toxic Ni in the growth medium
(Wycisk et al., 2004). Moreover, cell surface-engineered yeast displaying a histidine
oligopeptide (hexa-His) has been shown to adsorb 3-8 times more copper ions than the parent
strain and to be more resistant to Cu (Kuroda et al., 2002).


Polyamine contents are also altered in response to heavy metal exposure.
Weinstein et al. (1986) showed an increase in putrescine content in Cd-treated oat seedlings
and detached oat leaves, with a marginal rise in spermidine and spermine content. Polyamines
influence a variety of growth and development processes in plants and have been suggested
to be a class of plant growth regulators and to act as second messengers (Kakkar and Sawhney,
2002). It has been suggested that they could stabilize and protect membrane systems against
the toxic effects of metal ions, particularly the redox-active metals.


Overall, the number of studies remains rather scarce, and the preliminary results available in 
the literature merely constitute a launching platform for this promising research field.


8. Future perspectives


The main objective of implementing metabolomics in soil risk assessment is to meet the
continuously increasing demand for safety data in human and ecological risk assessments.
Accordingly, regulatory programs worldwide are currently incorporating tests with
end-points that involve the effects of chemicals and their impact on specific metabolic pathways
(Ankley et al., 2006). Toxicological end-points can be general biological responses such as
survival or weight loss (Baylay et al., 2012), but specific biomarkers provide the accuracy that
was classically elusive for tests with living organisms.


Several issues immediately arise from the summary presented here, such as the need to
perform field toxicological tests with natural soils rather than artificial soils, as was the case
with some of the studies listed in Table 1. Ecotoxicogenomics can also benefit from the
incorporation of further analytical techniques. Techniques based on mass spectrometry are
certainly required to understand the mechanisms involved in the alteration of metabolic
pathways in response to toxicants. However, for screenings which merely require the detection
of differences between metabolic phenotypes, optical methods such as FT-IR would be suitable,
particularly if extremely high sample throughput is required (Bundy et al., 2009). Although
no data were available in the existing literature, Figure 2 illustrates the change in the fingerprint
of organic compounds in a soil amended with different sources of carbon, collected 10 d after the
application. While some of the groups of compounds might be merely related to the sources
of carbon added, the variations in the signal associated with polysaccharides (600-1000 cm⁻¹) can
be attributed to changes in the metabolic fingerprint of the soil system and therefore linked to
microbiological activity in soil. Overall, the introduction of these results seeks to encourage
further characterization of families of compounds in intact soil (or functional pools such as
aggregates) in relation to soil processes, an approach that can find immediate application
in the assessment of biological responses to toxic compounds in soil.


Figure 2. Absorption spectra obtained by Fourier transform infrared spectroscopy (FTIR) for an agricultural soil (S), soil
amended with fresh residues (a): dry leaf litter (SL), peanut shell (SP), maize residue (SM), and soil amended with
biochar (b) derived from those feedstocks (BL, BP or BM). Spectra presented (after 10 d incubation) are the average of 5
spectra obtained for different samples of each treatment. Hernandez-Soriano et al., unpublished data.
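

The kind of screening described above, comparing the signal in one spectral band between
treatments, can be sketched in a few lines of Python. This is only an illustration under stated
assumptions: the function names (mean_spectrum, band_area), the trapezoidal integration and
the toy absorbance values are hypothetical and are not taken from the data behind Figure 2. The
sketch averages replicate spectra, as was done for the figure, and integrates absorbance over
the 600-1000 cm⁻¹ window.

```python
from statistics import fmean


def mean_spectrum(replicates):
    """Average replicate spectra point by point (all share one wavenumber axis)."""
    return [fmean(values) for values in zip(*replicates)]


def band_area(wavenumbers, absorbance, low=600.0, high=1000.0):
    """Trapezoidal area under the spectrum between `low` and `high` (cm^-1).

    Only segments whose two end points both lie inside the window are counted,
    so the window edges should coincide with sampled wavenumbers.
    """
    points = sorted(zip(wavenumbers, absorbance))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if low <= x0 and x1 <= high:
            area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


# Hypothetical toy data: a coarse wavenumber axis and two replicates per treatment.
wavenumbers = [600.0, 800.0, 1000.0, 1200.0]
soil = [[0.10, 0.20, 0.10, 0.05], [0.12, 0.22, 0.12, 0.05]]
amended = [[0.20, 0.40, 0.20, 0.05], [0.22, 0.42, 0.22, 0.05]]

area_soil = band_area(wavenumbers, mean_spectrum(soil))
area_amended = band_area(wavenumbers, mean_spectrum(amended))
ratio = area_amended / area_soil
print(f"band area ratio (amended / unamended): {ratio:.2f}")
```

A ratio well above or below 1 would flag a treatment for closer inspection; in practice, baseline
correction and normalization would normally precede any such comparison.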


The variability of biological responses has been one of the main obstacles to their
implementation in standardized risk assessment. However, the examination of changes in
biological processes by accurate analytical techniques and powerful statistical tools has
launched a new era in our understanding of soil processes. Identifying the most sensitive
metabolites for a given toxicant and developing a tailored standardized test is the ultimate
goal pursued.
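

As a rough illustration of how the most sensitive metabolites might be singled out, the sketch
below ranks metabolites by the magnitude of a Welch t statistic between control and exposed
samples and also reports the log2 fold change. The metabolite names are borrowed from this
chapter, but the intensity values, the function names (welch_t, rank_metabolites) and the choice
of Welch's statistic are assumptions made purely for illustration; metabolomics studies
commonly use multivariate methods and corrections for multiple testing instead.

```python
from math import log2, sqrt
from statistics import fmean, variance


def welch_t(control, exposed):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = sqrt(variance(control) / len(control) + variance(exposed) / len(exposed))
    return (fmean(exposed) - fmean(control)) / se


def rank_metabolites(control, exposed):
    """Return (name, log2 fold change, t) tuples, largest |t| first."""
    rows = []
    for name in control:
        fold = log2(fmean(exposed[name]) / fmean(control[name]))
        rows.append((name, fold, welch_t(control[name], exposed[name])))
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


# Hypothetical relative intensities (arbitrary units), three replicates each.
control = {
    "proline": [1.0, 1.1, 0.9],
    "glutathione": [2.0, 2.1, 1.9],
    "choline": [1.0, 1.3, 0.7],
}
exposed = {
    "proline": [2.0, 2.1, 1.9],
    "glutathione": [1.5, 1.6, 1.4],
    "choline": [1.1, 1.4, 0.8],
}

ranking = rank_metabolites(control, exposed)
for name, fold, t in ranking:
    print(f"{name:12s} log2FC={fold:+.2f}  t={t:+.2f}")
```

With these toy values, the metabolite with the largest and most consistent change is listed
first, which is the property a tailored test would look for in a candidate biomarker.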